Rank | Count | Beginning |
---|---|---|
7121 | 519 | यह |
1252 | 364 | इस |
1444 | 184 | इसके |
2835 | 123 | एक |
8727 | 94 | वे |
960 | 92 | इन |
7826 | 92 | ये |
1292 | 77 | इसका |
2526 | 77 | उन्होंने |
8431 | 71 | वह |
4191 | 69 | जब |
3476 | 68 | कुछ |
2072 | 66 | इसी |
6485 | 66 | भारत |
1823 | 62 | इसमें |
7696 | 60 | यहाँ |
7641 | 54 | यहां |
1386 | 52 | इसकी |
2418 | 52 | उनके |
2143 | 48 | इसे |
8248 | 46 | लेकिन |
2644 | 44 | उस |
2376 | 42 | उनकी |
6210 | 42 | बाद |
9834 | 40 | हालांकि |
1025 | 37 | इनके |
2008 | 37 | इससे |
7062 | 37 | यदि |
2678 | 36 | उसके |
2231 | 34 | ईसा |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
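The query above can be mirrored outside the database. The following is a minimal Python sketch, not the tooling used to build the table: the function name `sentence_beginnings`, its parameters, and the three sample sentences are made up for illustration and are not taken from the corpus. It splits on single spaces, as `substring_index(sentence, ' ', N)` does, so a sentence with fewer than N words contributes its full text.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Count the first n space-separated tokens of each sentence.

    Hypothetical helper mirroring substring_index(sentence, ' ', n):
    a sentence with fewer than n tokens is counted as a whole.
    """
    counts = Counter()
    for sentence in sentences:
        # Split on single spaces only, matching the SQL delimiter ' '.
        tokens = sentence.split(" ")
        counts[" ".join(tokens[:n])] += 1
    # most_common sorts by descending count, like ORDER BY cnt DESC LIMIT.
    return counts.most_common(limit)


# Toy input for illustration only; these sentences are not corpus data.
sample = [
    "यह एक किताब है।",
    "यह घर बड़ा है।",
    "इस शहर में बहुत लोग हैं।",
]

for beginning, count in sentence_beginnings(sample, n=1):
    print(count, beginning)
```

Calling the same function with `n=2`, `n=3`, or `n=4` gives the longer beginnings covered in the following subsections, under the same whitespace-tokenization assumption.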